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ABSTRACT 

We use maximum entropy arguments to derive the phase space distribution of a virialized 
dark matter halo. Our distribution function gives an improved representation of the end prod- 
uct of violent relaxation. This is achieved by incorporating physically motivated dynamical 
constraints (specifically on orbital actions) which prevent arbitrary redistribution of energy. 

We compare the predictions with three high-resolution dark matter simulations of widely 
varying mass. The numerical distribution function is accurately predicted by our argument, 
producing an excellent match for the vast majority of particles. 

The remaining particles constitute the central cusp of the halo (<, 4% of the dark mat- 
ter). They can be accounted for within the presented framework once the short dynamical 
timescales of the centre are taken into account. 



1 INTRODUCTION 

For over two decades it has been possible to use numerical meth- 
ods to model systems of cold, collisionless dark matter particles 
collapsing under gravity to form stable, virialized 'halos' (Frenk 
eTaI]l 985; rDubinski & Carlberg| 1991; for a review see Frenk & 
White 2012). The density of these halos declines with radius fol- 



lowing a slowly changing power law dependence, rou ghly p ~ r 
at small radii and p ~ in the outer regions {Navarro et al. 



|1996b||Kravtsovetal.[ l997 ; Moo re et al.|1998| >. Despite some early 
uncertainty, recent simulations with independent computer codes 
all reproduce this result (e.g. Diemand et al. 2008 ; Navarr o~et al.| 
|2010| [Stadel et al.|2009| >. The universal behaviour seems to be in- 
dependent of the power spectrum of initial linear density fluctua- 
tions JMoore et al.|1999[|Reed et al.|2005||Wang & White|2009) 
as well as the mass of the collapsed object and the epoch of col- 
lapse, although together these deter mine a scale radius for the tran- 
sition from r -1 to r~ 3 behaviour {Cole & Lacey||l996[ |Navarro 



eTaT1[T997l TBullock et al.1|2001b| [Ekeet al.|200ll |Macci6 et al. 



2007[ >. Even simulations of 'cold collapse', for which initial condi- 



tions consist of a homogeneous sphere of particles, seem to produce 
similar universal profiles ( |Huss et al.|1999| >. 

It remains an outstanding question, however, whether this uni- 
versality can be adequately explained from first principles. Until 
this question is answered, we do not fully understand what the uni- 
versality means and must rely on new simulations to predict the 
effect of changes in the initial conditions or particle properties. 
The experimental result that monolithic collapse produces the same 
types of system as hierarchical merging jHuss et al. |T999"1 |Moore1 
et al. 1999, Wang & White 2009) is provocative: it means that any 
explanation for universality which invokes a specific cosmology 



(e.g.|Syer & White|1998| [Dekel et al.|2003] |Salvador-Sole et al.| 



|2012 1 must be describing a special case of a more general process 
l |Manrique et al.|2003| >. 



Attempts to understand collisionless gravitational collapse's 
insensitivity to initial conditions were pioneered by Lynd en-Bell| 
l |1967| l in the context of self-gravitating stellar systems. Adopting 
Boltzmann's procedure for deriving the thermodynamics of col- 
lisional systems, Lynden-Bell maximized the entropy of the sys- 
tems subject to fixed energy. This implies a density profile obeying 
p(r) °= r~ 2 , so disagrees with the results of numerical experiments. 
Moreover this approach gives rise to a number of physically ques- 
tionable conclusions (for reviews see Padmanabhan 1990; Lynden- 
|Bell| 1 9 99 >. The clearest of these is the 'gravothermal catastrophe': 
entropy can be increased without bound by transferring energy 
from the innermost orbits of a self-gravitating system to the out- 
ermost orbits < |Lynden-Bell & Wood|1968| |Tremaine et al.|1986| >. 
This implies the existence of a runaway physical instability in 
which the majority of material collapses into an exU'eme central 
density cusp or black hole. 

Observations and numerics both suggest that something pre- 
vents the above catastrophe from occurring on any reasonable 
timescale; in other words, a physical constraint is preventing the 
arbitrary redistribution of energy ^White & Narayan[T9 87 1. In the 
spirit of Jaynes ( 1957) a general explanation for the final state can 
still be based on the ideas of statistical mechanics but in the pres- 
ence of constraints other than energy. The additional constraints 
will represent the incompleteness of the energy equilibriation ef- 
fects of violent relaxation. 

This is the approach we adopt in the present work. The likely 
distribution of particles in phase space is selected by maximizing 
entropy, 



://ln/d f 



CO. 



(1) 



where k is Boltzmann's constant and f((o) is the probability of 
finding a particle in a specified region of phase space CO, subject 
to relevant constraints on the desired solution. (This approach can 
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be motivated by showing that the vast majority of states consistent 
with a given set of constraints are to be found near the maximum 
entropy solution; see Appendix [A] for further discussion and refer- 
ences.) 

The constraints applied arise from the dynamical evolution of 
collisionless systems. We will argue that, in the late stages of vio- 
lent relaxation, there is a diffusion of particles in phase space which 
approximately conserves the sum of orbital actions. This sum is 
progressively better conserved as equilibrium is approached and 
therefore its role in establishing that equilibrium cannot be ignored. 
Applying the maximum entropy recipe, we will show that the phase 
space structure of equilibrium halos is then reproduced over orders 
of magnitude in probability density. To our knowledge, this is the 
first instance of maximum entropy reasoning, applied to 6D phase 
space and subject to physically motivated constraints, producing 
such success in a collisionless system. 

The remainder of this work is organized as follows. First, the 
conservation of action is discussed (Section Then a canoni- 
cal ensemble constructed on this basis (Section [2.2| l yields a phase 
space distribution in quantitative agreement with high resolution 
numerical experiments (Section[3j. A discrepancy affecting a small 
fraction of particles at low angular momentum is highlighted in 
Section [33] Finally we discuss the predicted radial density profiles 
which again highlight the need for special treatment of low angular 
momentum orbits (Section |3.5[ >. We conclude in Section]?] 



2 THE ANALYTIC PHASE SPACE DISTRIBUTION 
2.1 Conservation of action 

Our first task is to identify and explain some relevant quantities 
which should be held fixed when maximizing entropy. This will 
be central to our argument because such constraints represent the 
incompleteness of violent relaxation's tendency to redistribute en- 
ergy, generating a different solution from the one based on energy 
conservation alone. With this in mind we will show that the radial 
action J r (to be defined below) is conserved in an average sense 
even during rapid potential changes. 

This average conservation does not appear to have been dis- 
cussed elsewhere in the literature. We will first show how it can 
be derived from previous work (Pontzen & Governato 20121 when 
the potential changes instantaneously, maintaining the sphericity of 
the halo. Then a more general (but more abstract) argument will 
be given which additionally shows that the other two actions (the 
z-component of the angular momentum j, and the scalar angular 
momentum j) are also conserved in the same average sense. The 
second approach encompasses perturbations to the potential which 
have variations on arbitrary timescales and may break spherical 
symmetry. However the first has a more intuitive content and there- 
fore forms our starting point. 

In a spherical system, the radial action J r is defined by 



Jr 



2£-2<5(r;r)- j 2 /r 2 dr, 



(2) 



Here E is the specific energy, j is the specific angular momentum 
and 4> is the potential at a given radius r and time r; the r integral 
is taken over the region where the integrand is real. 

The radial action J r has the same units as specific angular mo- 
mentum j. This reflects the similar conservation roles these two 
quantities play for the radial and angular components of the motion. 
In particular is exactly conserved if any changes in the potential 
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Figure 1. An illustration of the diffusion of particles in action space. 
Here particles in the inner 10 kpc of a simulation have been selected and 
their radial actions J r numerically calculated at two timesteps separated 
by Ar = 2.7 Gyr. The change in the population mean action is small (p = 
— 12.1kpckms~') compared against the magnitude of the random diffusion 
(CT = 287kpckms -1 ). 



occur sufficiently slowly ('adiabatically') in time (e.g. jBinney &] 
|Tremaine|1987) . 

On the other hand in the rapid, impulsive limit under an in- 
stantaneous change in energy E — > E + AE and potential 4>(r) — >• 
<5(r) + A<J>(r), the action of the particle is changed at first order: 



AJ r 



dJ r 

Je 



AE- 



j; 



drA4> 



Sj r 
5* 



(3) 



In Pontzen & Governato (2012) we showed that the energy shift AE 
induced by the change of potential, averaged over possible orbital 
phases of the particle, is 



<AE> = - 



| °°drA* 8J r /8<S>\ E 
dJr/dE^ ' 



(4) 



an exact result (see equation 12 of |Pontzen & G overnato 2012| >. 
Here angular brackets denote averaging over all possible phases of 
the orbit. Considering the probability distribution of radial actions 
after this change, one has (AJ r ) = at first order, by substituting 
equation ^ in (]3j. Even though a specific particle will change its 
radial action, the ensemble average is conserved. 

This result connects closely with the standard adiabatic argu- 
ment that A7,- = if any changes to the potential occur on long 
timescales. In our case, however, the necessary 'phase averaging' 
does not occur over time for an individual particle but instead via a 
statistical consideration of an ensemble of particles spread evenly 
through all possible phases. 

We can generalize as follows. Adopting the complete set of 
action-angle coord inates for phase space (e.g. [Park 1990; Binney| 
& Tremaine|l987[ l, the momenta are /= (J r ,j,jz) where j is the 



total angular momentum and j z is its component in the z direction 
(so — j < j, < j). The conjugate coordinates are = (\ff r ,(j),x), 
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taken to be periodic with interval 2n, and the Hamiltonian is 

H(J,@) = H Q (J,t) + h(J,&,t). (5) 

Here h is an arbitrary perturbation. It may consist of a long-lived 
term (perhaps a departure from spherical symmetry) and fluctua- 
tions on arbitrary time-scales. In the background, the coordinates 
change at a constant rate, 



= © o + £2f, 



(6) 



and the frequencies for the radial and azimuthal motion Sl r and Slj 
obey 



Sl r ■- 



SI: 



(7) 



by Hamilton's equations. The frequencies may change slowly with 
time (dSl/dt <C SI 2 ). J is conserved in the background but when h 
is non-zero, the equations of motion read 



dJ 
dt 



dh_ 



(8) 



Because equation |6]( shows that particles in the background move 
at a uniform rate in the coordinates, an equilibrium distribution 
/ has no dependence; i.e. the density / is a function of /alon^] 
From this it follows that 



dt 



(J) = -/d 3 /d 3 0/(7)||=O 



(9) 



where the result is obtained via integr ation by parts. This mea ns an 
individual particle's J can 'diffuse' |Binney & Lacey||l988t over 
large distances o" in action space, 

a = {{8J) 2 ) l l 2 = ff{e), (10) 

compared to the variation in the mean fl of the population, 



H=(8J) = ff{E 2 ). 



(11) 



We can inspect this diffusion in a simulation by calculating the rel- 
evant actions at two timesteps. As an example, Figure [T] shows a 
histogram of changes in the radial action of tightly bound particles 
in the forming "Dwarf" halo (see Section|3j between z = 3.1 and 
z = 1 .4, a time interval of approximately 2.7 Gyr. We define 'tightly 
bound' by selecting particles interior to lOkpc at the earlier time 
step, and calculate the J r values of these particles in both outputs 
according to the numerical recipes given later (Section [3. l[ i. 

As expected from the linear analysis, Figure [T] shows that in- 
dividual simulated particles change their actions more rapidly than 
the population mean. Quantitatively, the change in the population 
mean fl is — 12.1kpckms _1 whereas a typical particle has moved 
by a = 287kpckms~ 1 ^> from its original value. Note also 
that the mean J r value for these particles in the final timestep is 
(J r ) = 187kpckms~' < rj, meaning particles really do cover sig- 
nificant distances in action space. 

This analysis confirms that (J) evolves slowly and, although 
it is not exactly conserved, it forms a constraint of motion that can- 
not be ignored on finite timescales. A complete description would 
require investigation of different moments of the distribution at 



' The conjugate coordinate to j z is also a constant of motion in the back- 
ground, so this argument does not strictly show d(j z ) /dt = 0. However (j z ) 
must be exactly conserved anyway if the perturbations are internally gener- 
ated, since it is proportional to a component of the total angular momentum 
vector. 



higher order in perturbation theory. For now, however, we have mo- 
tivated a picture in which (J) evolves sufficiently slowly compared 
to the diffusion of an individual particle that it must be considered 
fixed in analysis of the distribution. 



2.2 The new canonical ensemble 

In the Introduction we explained that, to obtain a phase space distri- 
bution function, we will maximize the entropy {TJ subject to con- 
straints on the particle population (further discussion is given in 
Appendix |A) . As well as energy conservation, we apply the 3- 
vector of constraints on (J) discussed above. This gives rise to a 
total of four Lagrange multipliers in the resulting distribution func- 
tion: 



/(./)<* exp (-P-J-PeE(J) 



(12) 



where the Lagrange multipliers are j8 = (ft, ft, ft) and ft. In the 
absence of the new constraints, jS = and ft is identified with 
1 /kT (where T is the thermodynamical temperature). 

All four constants can be determined in a variety of ways de- 
pending on the situation; for a complete account of structure forma- 
tion one would like to be able to derive them from the initial condi- 
tions, but this lies beyond the scope of the current paper (although 
see Section [3~4| for further comments). The lack of any reference 
to in equation \\2\ indicates that the solution is phase-mixed, as 
required for equilibrium. 

Equation l | 1 2| i is the essential prediction of the present work. 
As with any prediction derived from a maximum entropy argu- 
ment, it will be able to fit the actual ensemble only if we have en- 
capsulated enough of the dynamics within the constraints ( |Jaynes| 
|1979b| l. The rest of this paper is concerned with testing to what 
extent that is the case. 

First consider the probability of finding a particle with J r in a 
given interval (ignoring j and j z coordinates). This is given by 



p r (J r ) = [ 2K d 3 & [~dj f J dj z f(J), 
Jo Jo J-j 



(13) 



because the action-angle coordinates are canonical, so the phase- 
space measure is constant. In the limit that the energy of the system 
becomes large at fixed action, equation |T3j can be solved: 



p r (J r ) oc exp-ft7 r 



(ft=0), 



(14) 



but it is not immediately clear whether we will be operating in this 
regime. More generally a closed form for p r (J r ) is hard to obtain, 
but we can at least show that 



dlnp,. 

dJ r 



-ft{n,) y ,.-ft, 



(15) 



which can then be integrated numerically for a given case to give a 
concrete comparison between equation l | 1 2[ i and simulations. Here 
we have defined 



{Sl r )j r = — f f(J)Sl r (J)djdj z 



(16) 



which is the mean of the radial frequency for particles with a fixed 
J r . With this definition, relation l |15| > can be derived from equation 
< | 1 3 1 > , recalling that the radial frequency Sl r of the particle's orbit 
obeys equation |7}. We will investigate and explain the distribution 
of J r values predicted by equation l |15| l in Section [3~2| 

Now consider the distribution of total angular momentum j. 
We will follow exactly the same series of manipulations as for the 
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Dwarf MW Cluster 

r 200 = 98kpc; M 2 oo = 2.8 x 10 l0 M a ;c= 19.6 r 20 o = 301 kpc; M 20 o = 8.0 x 10 11 M G ; c = 15.5 r 200 = 1.43 Mpc; M 20 o = 8.7 x 10 13 M 1? ; c = 9.9 
Ar part = 3.4xl0 6 ;£ = 65pc. W parl = 5.3 x 10 6 ; e = 170pc. N pm = 8.9 x 10 s ; e = 690pc. 



Figure 2. Images of the three dark matter simulations, accompanied by numerical properties. Respectively r 2 oo, Mjoq, c, W palt and e denote the radius at which 
the density exceeds the critical density by a factor 200; the mass within this radius; the 'concentration', c = J"20o/'s wnere r s is the NFW scale radius as 
described in the text; the number of particles within r 2 oo; and the gravitational softening length in physical units at z = 0. The images are scaled to show the 
virial sphere of the main halo. The brightness represents the column density of dark matter (scaled logarithmically to give a dynamic range of 3000 in each 
case); the colour corresponds to a density-weighted potential along the line of sight. 

now compare that expectation against simulated dark matter halos. 
Our strategy is to integrate equations \15\ and ^19\ numerically for 
these simulations and compare to the actual distribution of particles 
binned by J r and j respectively. 

We will present results from three simulated dark matter ha- 
los (shown in Figure [2}, chosen to span a wide range of masses 
with an approximately constant number of particles per halo (sev- 
eral million in each case). We also compared our results against 
the GHALO multi-billion-particle phase space (Stadel et al. 20091, 
finding good agreement similar to that described for our "MW" 
halo here. This gives confidence that the mechanisms and results 
discussed in the paper are not sensitive to numerical resolution. 

Our simulations are run from cosmological initial conditions 
at z — 100 in a 'zoom' configuration (Navarro & White 1993}, i.e. 
with high resolution for the main halo and its immediate surround- 
ings and lower resolution for the cosmological environment. The 
softening lengths e for the high resolution region are listed in Fig- 
ure|2]and are fixed in physical units from z = 9, prior to which they 
scale linearly with cosmological scalefactor, a compromise moti- 
vated by numerical convergence studies (Diemand et al. 2004}. We 
verified at the final output (z = 0) that the high resolution regions 
have not been contaminated by low resolution particles, and that 
the halo real-space density profiles are well described by a slowly 
rolling powerlaw, in accordance with all recent simulations (e.g. 
IDiemand et al.|2008| [Stadel et al.|2009"l|Navarro et al.|2010| and 
references therein). 

Each simulation output contains full cartesian phase space co- 
ordinates (x, v). The position space is re-centred on the central den- 
sity peak of the halo using the 'shrinking sphere' method of Power 
|et al.| j2003 1. The velocities are re-centred such that a central sphere 
of radius r2oo/30 has zero net velocity, where ^200 is the radius 
at which the mean halo density is 200 times the critical density. 
Henceforth we only consider particles inside r2oo- 

From left to right in Figure[2]the simulated halos become more 



radial action; however in this case the marginalization over j z in- 
troduces a non-trivial term: 

r 2% 



P j (./) = jf d 3 ®J^ dj z jf" dJ r f{J) 



cxsinh( J 8,j)exp(- J 6 / j) j d/ r exp (-&£) . (17) 
Once again, in the case Pe — > 0, we have a fully analytic expression 

P j(j)c S mh(p z j)cxp(-pjj) (fe=0), (18) 

which will serve as a useful point of comparison. More generally 
we can differentiate equation {17} to obtain 

^ = £coth(/y) - fe(Q;>; - fr, (19) 

where Q.j is the angular frequency of the orbit and 

{aj)j = j f{J)£lj(J)AJ r &j z (20) 

is the mean angular frequency of particles at fixed j. Equation |T9} 
for the angular momentum distribution (ignoring all other coordi- 
nates) is the equivalent of equation {T5} for the radial action dis- 
tribution. Once again we will investigate and explain the shape it 
predicts in Section [33] First, however, we will explain the simula- 
tions which serve as a point of comparison for the later discussions. 



3 COMPARISON TO SIMULATIONS 
3.1 Overview of the simulations 

In the previous section we applied maximum entropy reasoning to 
conservation of energy and approximate conservation of action to 
derive an expected equilibrium phase space distribution. We will 
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Figure 3. The distribution of particles' radial action ],- (left panel) and scalar angular momentum j (right panel) in MW (both panels show the distribution 
of particles as a histogram). Also shown are maximum entropy solutions based on energy conservation alone (dotted curve); action conservation alone 
(dashed curve) and our advocated solution using both constraints (thick solid curve). The last of these provides a good reproduction of the distribution for 
J, < 7 r ,brcak = 3 x 10 4 kpckms~' and j < y'break = 4 x 10 4 kpckms~' respectively, while extending over orders of magnitude in probability density. Less than 
0.1% of particles lie at J,- > /,-break; approximately 0.2% lie at j > ./'break- Despite the good overall agreement, problems become apparent at very small j; a 
blowup of the indicated range j < 3500kpckms~' is given in Figure 4. 



massive. The width of each panel is equal to 2r2uo and the luminos- 
ity is scaled to represent the column density over a dynamic range 
of 3000. The most conspicuous aspect of Figure 2 is that the ha- 
los become less centrally concentrated. We verified this by fitting a 
classic "NFW" (Navarro et al. 1996b) formula to the density pro- 
file. The NFW fit, 

(l + r/r s ) z (r/r s ) 

yields po, a characteristic density, and r s , a scale radius. The latter 
is often expressed in a scale-free manner as a concentration value 
c = f2oo/r s ; we have recorded the value for each halo in Figure [2] 
As expected the concentration decreases with increasing mass, in 
agreement with previously known trends (e.g. Macci 6~et al. 12007) 
Bullock et al. 2001b I. We thus have a sample of cosmological halos 
which span a wide range in both mass and concentration. These 
different concentrations are thought to arise from different mean 
densities in the universe at the epoch of collapse (Navarr o" et al.| 
[79971 |Bullock et al.|2001b"> . 

All halos in Figure [2] exhibits large amounts of substructure; 
we will present results with this substructure subtracted, although 
we have verified that including the substructure does not have a 
qualitative impact on our results. The substructure is identified and 
removed using the "Amiga Halo Finder" (Knollmann & Knebe 
[2009l >. For each remaining particle inside rino, the specific scalar 
angular momentum is given by j =\vxx\. We calculate J r by eval- 
uating equation Q numerically, using a spherically-averaged po- 
tential $ defined by 

• W =/V«£4 (22) 
,/0 r 

where M(< r 1 ) is the total mass enclosed by a sphere of radius 
r 1 , and the specific energy E of each particle is defined as E — 



v 2 /2 + <&(/). J r is evaluated using the true spherical potential out 
to r term = 3r2oo> beyond which (for reasons of numerical speed) 
the calculation is truncated and an analytic completion assuming 
a Keplerian (vacuum) potential is taken. We verified that changing 
'"term to 4>2oo had little impact on the results. 

Before proceeding to a comparison, we need to derive appro- 
priate jS values. We calculate these using a Monte-Carlo Markov 
chain (MCMC) to maximize the likelihood 

J?(P,M = I\f( f <<P>M (23) 

i 

where / is the 1 -particle distribution function \\2\ normalized such 
that / d 3 /d 3 0/(7) = 1. This normalization must be accomplished 
numerically on a grid of J r ,j values; at each grid-point E(J r ,j) 
is calculated by operating a bisection search on equation (|2j. This 
need only be done once, and then the evaluation of each link in the 
Markov chain is rapid. 

The operation gives us maximum likelihood (i.e. "best fit") 
parameters 2 (|3/, p\,/3,-, jBg) for a given simulation, optionally sub- 
ject to constraints (such as jS^ = or j3y = /3 Z = jS r = 0). We are 
now fitting up to four parameters (excluding mass normalization), 
more than the one or two parameters normally used by simula- 
tors to describe their halos (e.g. Nav arro et al.|20 04; Stade l et al.| 
2009 ). However the fitted real-space density profiles are purely phe- 
nomenological constructs; conversely here we are starting with a 
functional form derivable from physical considerations. As we have 
commented in Section [2~2| and will expand upon in Section [3~4| the 
j8's should ultimately therefore be derived from initial conditions. 



2 The MCMC technique also yields uncertainties on the j8 values, but these 
will not be considered further in the present work. 
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For the present, however, the objective is to see whether our physi- 
cal argument can correctly describe the phase space distribution at 
all, for which fitting j3's is the most pragmatic approach. 



3.2 Comparison with MW: J r distribution 

We will now start to test how closely equation (12) represents the 
distribution of particles in our simulations. We will investigate MW 
in some detail, before showing results for the other two simulations 
to which the same discussion can essentially be applied. 

The distribution of J,- values in MW is shown by the histogram 
in the left panel of Figure [3] This can be compared with the thick 
solid curve which shows the distribution of J r values according to 
expression (15) ; the parameters are jS^ 1 = 1.6 x 10 4 km 2 s~ 2 and 
/3 r _1 = 4.4 x 10 3 kpckms _1 . The agreement is excellent over sev- 
eral orders of magnitude in probability density, spanning the values 
< J r < ./break where /break — 3 x 10 4 kpckms _1 . Particles with 
J r > /break account for less than 0.1% of the mass and are on long 
period orbits, probably reflecting new material falling into the po- 
tential well. We will not consider them further. 

The shape of the J r solution can be understood as follows. We 
have already remarked that, in the limit Pe — ► 0, one recovers equa- 
tion {14}, an exact exponential (i.e. a straight line on the linear-log 
axes of Figure[3}. For comparison we have plotted the best fit distri- 
bution of this form (with j3~ J = 3.5 x 10 3 kpckms _I ) as a dashed 
line. Since the period of an orbit increases with its energy (or ra- 
dial action), the mean frequency (Q. r ) j r decreases for increasing J r . 
So, inspecting equation (15) , there will always be a J r value above 
which PE{Qr)j r becomes much smaller than p r . Looking again at 
the thick solid curve in Figure [5] the limiting solution at high J r is 
indeed a pure exponential as this reasoning would suggest. At small 
J r , however, the gradient of the solution is steeper because of the 
energy term. 

Comparing the histogram, the thick solid curve and the dashed 
line in the left panel of Figure|3]thus leads us to the conclusion that 
J r conservation (dashed line) accounts rather well for the qualita- 
tive form of the distribution, with an important correction from E 
conservation at low J r . Finally the dotted curve shows the best fit 
case with j8 r = - i.e. the normal statistical mechanical result in 
the absence of other constraints - and provides a poor fit at all J r . 
In summary, the identification of the J r constraint has resulted in 
dramatic improvements in the match to simulations. 



3.3 Comparison with MW: j distribution 

Now consider the right panel of Figure [3] which shows the distri- 
bution of scalar angular momentum for the particles in our MW 
simulation. Once again the simulated particles are shown by the 
histogram; the best fit maximum entropy solution (Pj ,Pr l = 

4.4 x 10 3 ,1.1 x 10 4 kpckms~' , with Pe as quoted above) is shown 
by the thick solid curve. It again reproduces the correct qualita- 
tive behaviour up to /'break = 4 x 10 4 kpckms~', with only 0.2% 
of the mass at j > /'break ■ Although the angular momentum distri- 
bution has some fluctuations away from the predicted behaviour, 
the predictions remain nearly correct over two orders of magnitude 
in probability density. With the exception of a problem described 
below, we do not believe these fluctuations to be of particular im- 
portance beyond indicating the structure is not completely relaxed. 
In particular we will show later (Section \3.5\ that these inhomo- 
geneities can be ignored when reconstructing a density profile in 
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Figure 4. Despite good agreement over the majority of j space (see right 
panel of Figure]!}, the fraction of simulated orbits (histogram) at very low 
angular momentum is substantially underestimated by the simplest maxi- 
mum entropy argument (thin dotted curve). One fix discussed in the text is 
to postulate a second population at low energies (dashed curve). This yields 
a much better low- j fit (solid line) without affecting the high- j fit (except 
through a minor renormalization). 



real space. Certainly compared against a solution based on E con- 
servation alone, again shown by a dotted curve, our solution can be 
counted a success. 

The basic shape of the predicted /' distribution can be under- 
stood in a similar way to the J r distribution explained above. Con- 
sider again the case where Pe = (so in effect the total energy is 
unconstrained); then the exact solution is given by equation (18) . 
We can also take the isotropic limit, j3 z — > 0, giving 

pj (j) - j exp(- j3, 7 ) (ft = 0, p E = 0). (24) 

This is analogous to the radial action case (14) , but with a degen- 
eracy factor j reflecting the increasing density of available states 
available as the angular momentum vector grows in size. The result 
is that the abundance of particles grows linearly with j for j < jS^ 1 
and decays exponentially for j > fij . 

In light of the above discussion, it is notable that the turnover 
from growth to decay in pj(j) occurs at j values much smaller than 
Pj ■ There are two ways to accomplish this. The first is to create 
a highly anisotropic setup, P~ <C /'n, where jo is the smallest j 
value of interest. This packs orbits as much as possible into a single 
plane, generating a large net angular momentum and destroying 
the approximate spherical symmetr)]^] but effectively removing the 
degeneracy in j altogether: 

(j37 1 </, fe = 0). (25) 



Pj U)°<exp[(p z -pj)j) 



Because it is maximally anisotropic, this solution cannot reflect 
the simulations; however if we temporarily fit only j values using 

3 We note in passing that, technically, distribution functions with net angu- 
lar momentum can nonetheless generate spherical potentials t Ly nden-Bell| 
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Figure 5. As Figure]^] but for the remaining two simulations. Once again, the maximum entropy distribution subject to /and E constraints (thick solid lines) 
predicts the simulations (histogram) accurately. The simulated distribution in J, for the cluster (lower left panel) has some noticable fluctuations over large 
scales; this is likely because it is dynamically young. 



the functional form (T8},_we are pushed towards this unphysical 
limit (dashed line, Figure 3 right panel; (jSr 1 , J3" 1 ) = (5.9, 6.3) x 
10 2 kpckms- 1 ). 

Luckily this is not the only way to overcome the shrinking 
phase space at low j. Equation {19} shows that if (£2/)/ increases 
fast enough as / -> it can overcome the coth(j3 z /) degeneracy 
term. We have verified that in the full solution (thick solid line in 
Figure |4]right panel), this is the mechanism by which the turnover 
is pushed to low j. 

Focussing attention on the low- j part of the distribution does, 
however, reveal a deficiency in our predictions. Figure [4] shows 
the distribution of orbits with j < 3500kpckms _1 . The dotted line 
shows the same maximum entropy fit depicted by the solid line in 
Figure [5] When the horizontal scale is expanded in this way, it be- 
comes clear that the global fit undershoots the simulation values 
significantly at low j. This appears to be a systematic feature of all 
simulations we have inspected (the three detailed here, GHALO, 
and various other lower resolution simulations which we used for 
testing purposes). It is possible to force a better fit by restricting the 
likelihood analysis to this region, but the global agreement is then 
considerably worse. 

This suggests that the behaviour at low j is marginally decou- 
pled from that in the rest of phase space. This could arise from 
the wide range of orbital periods: particles with small actions also 
have periods much shorter than the rest of the halo. The coupling 
between particles will necessarily be weak if their timescales are 
very different (since particles on short orbits react adiabatically to 
fluctuations on long timescales). This can substantially suppress re- 
distribution of scalar angular momentum and is consistent with, al- 
though not reliant on, the early formation of a stable central cusp 
in simulations (e.g. |Moore et al.|1998"l|Lu et al.|2006||Wang et al.| 
[2011) . 

In principle this weakness of coupling between orbits in dif- 
ferent regions could be expressed as a further constraint in the 



maximum entropy formalism. Further investigation awaits future 
work, but for now we will use this as a motivation to study a two- 
population system. We are not suggesting that there really are two 
sharply defined populations, but that this should anticipate the fea- 
tures of incomplete equilibrium. 

Our maximum likelihood analysis is able to find a dramat- 
ically better fit in this case, placing 3.5% of the mass in a sec- 
ond population at substantially lower temperature (jS^ 1 = 2.8 x 
10 3 km 2 s~ 2 ). The summed distribution is shown by the thick solid 
line in Figure[4] with the contribution from the subdominant popu- 
lation indicated by the dashed line. Because the second distribution 
is so peaked near j = 0, the only difference at high j is a marginal 
renormalization. We also verified that the J r distribution is barely 
affected. 

It is undeniably disappointing that our solution does not auto- 
matically accommodate the behaviour at very low j, but we expect 
that future development of the ideas above can quantitatively ac- 
count for the discrepancy. We consider other possible explanations 
in Section [4] However after focussing so much on one corner of 
phase space we should re-emphasize the major conclusion: the dis- 
tribution over both J r and j for 96% of the particles are remarkably 
well described by the maximum entropy expression (12}. 



3.4 Other simulations 

We confirmed that these basic conclusions persist in other sim- 
ulations. The top row of figures in Figure [5] shows results from 
the 'dwarf simulation. The J r distribution (top left panel) again 
shows excellent agreement for J r < ./break* where ./break = 4 x 
10 3 kpckms -1 . Only a tiny fraction of mass (< 0. 1%) lies beyond 
this point of breakdown. As with MW, the dwarf's angular mo- 
mentum distribution (top right panel) has more conspicuous fluctu- 
ations, but still roughly adheres to the maximum entropy solution 
up to y'break = 3 x 10 3 kpckms _1 , with around 1.3% of mass lying 
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Figure 6. The best fit scales for radial action (upper panel) and energy 
(lower panel) as a function of mass. The crosses show the values from the 
three simulations, while dotted lines give the expected scalings \26\ and 
(27) , which a gree well with the simulations. 



beyond this point. We verified that at very low angular momenta 
j < lOOkpckms there is again an overabundance of particles in 
the simulation, although in this case it accounts for less than 2% 
of particles compared against the 3.5% in MW. The parameters of 
the dwarf fit are (fi- 1 , pr 1 , fif 1 ) = (4.2,1.9,2.1) x 10 2 kpckms-> 

andjS^ 1 =2.2x lO'kms- 1 . 

Considering the cluster simulation (lower row of Figure |5j 
gives similar results once again. This time the J,- distribution as 
well as the j distribution shows some notable fluctuations around 
the maximum entropy description. This may be because clus- 
ters assemble later (as we discussed in Section |3.1[ this is re- 
flected in the lower concentration value), so the system is dy- 
namically young; however we have not explicitly looked at time 
dependence of these distributions. The parameters of the clus- 



ter fit are (ftr 1 ,/Jr' ,j8; 



= (1.5,0.8,1.2) x KPkpckms- 1 and 
3.5 x lCPkms- 1 with 0.1% and 1.1% of the mass in the 



unrelaxed components beyond /break = 7 x 10 5 kpckms~' and j 
6 x 10 5 kpckms _I respectively. 

Naively one would expect jB^ 1 to scale approximately as 

GM 200 



Pe 



■M. 



2/3 
200' 



(26) 



since M200 and ^200 are by definition related through the fixed- 
mean-density condition M200 <* r^oo • Similarly the actions P~ l 
should scale as 



]8- 



^200 



GM200 
^200 



■M, 
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(27) 



Figure [6] compares these expectations with the actual values, al- 
though we immediately caution against taking the scaling of three 
halos too seriously. The upper panel shows the radial action, jS^ 1 , 
as a function of mass (crosses) with dotted lines indicating the scal- 
ing \21) . The lower panel shows the same for the energy scales. 
Both panels show good agreement with the expected trends. 



Figure 7. The real-space density distribution (upper panel) of the 1- 
component and 2-component maximum entropy solutions (dotted and solid 
lines respectively) compared to the MW simulation binned density profiles 
(dots). The softening length in MW is 170pc, so the profile should be reli- 
able exterior to ~ 700 pc (e.g. Power et al. 2003 1. The generic maximum en- 
tropy result is a density profile with slowly steepening powerlaw to increas- 
ing radii, in agreement with the simulations. The 1 -component fit misses the 
central density cusp, showing that the ~ 3.5% correction to the low angular 
momentum orbits (Section [3.3( is required to reproduce this quintessential 
feature of simulated dark matter halos. The lower panel shows the cumula- 
tive mass as a function of radius. 



For clarity we did not over-plot the j3y values in Figure [6] but 
these can be seen to be comparable to j3 f . Because cosmological 
halos are formed from near-cold collapse, their initial angular mo- 
mentum will be small. The final dispersion of angular momentum 
is likely generated through a weak form of the radial orbit instabil- 
ity (e.g.|Saha|1991||MacMillan et al.|200l^|Bl-ITovary et al.|2008[ 



|Barnes et al.|2009y Thus the scales of the angular momentum dis- 
tribution and the radial action distribution are likely to be intimately 
linked. This is one example of a dynamical consideration which 
should ultimately be used to link ft values to the initial conditions. 



3.5 Real space radial density profiles 

We have shown that a first-principles maximum entropy argument 
is capable of describing the phase space distribution of particles in 
dark matter halos, up to a small correction at low angular momen- 
tum. The natural next step is to ask what kind of real-space radial 
density profiles are implied by this phase space distribution and 
whether these match the classic rolling-powerlaw shape given by 
simulations. 

Calculating the density distribution corresponding to the phase 
space distribution \\2\ is technically involved; a description is 
given in Appendix [B] There we also explain how the same com- 
puter code can be used to calculate equilibrium density profiles 
from simulations (as opposed to analytic distributions). These pro- 
files are generated subject to our simplifying assumptions of phase 
mixing and spherical symmetry. They agree well with traditional 
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'binned' estimates of the density, validating the assumptions. Fur- 
thermore in the new method, each simulated particle is smeared out 
over its orbit, resulting in considerably smaller Poisson noise than 
from traditional binned estimates. 

Applying the algorithm to the maximum entropy solution, we 
find that the radial density profile implied by equation \ \2\ follows 
a shallow power law in the centre and steepens with increasing ra- 
dius, in qualitative agreement with the behaviour seen in numerical 
simulations ( |Navarro et al.|1997||2004l|Stadel et al.|2009| and ref- 
erences therein). However, when using the single population phase- 
space distribution fits, the central slope is too shallow (dotted line, 
Figure |7J. One can obtain higher central densities and inner slopes 
by changing the parameters, but then the outer slope becomes too 
steep. On the other hand, if one adopts the incomplete relaxation fit 
advocated in Section [3~3) a vastly improved real space density pro- 
file is recovered (thick solid line, Figure[7](. This confirms that the 
~ 3.5% population at low- j is responsible for controlling the cusp. 
In the discussion below we will recap our current understanding of 
this issue and give directions for future investigation. 



4 DISCUSSION 

We have shown that maximizing the entropy of a distribution func- 
tion subject to constraints on total action and energy reproduces 
the phase space density of particles in simulated dark matter halos. 
Crucially, there is a clear physical motivation behind this choice 
of constraints. We started by explaining that, since any equilibrium 
distribution must be phase-mixed, the late stages of relaxation ap- 
proach this phase-mixed state. As a consequence (J) becomes a 
conserved quantity as equilibrium is approached (Section [2T| . This 
constitutes a dynamical barrier to continued evolution, preventing 
energy from being further redistributed. 

The resulting canonical ensemble (i.e. the maximum entropy 
solution) is given by equation \\2\ . From it we derived two key 
relationships which can be used to test the phase space of simu- 
lated halos, respectively equations l |15fr and \19) . These were used 
to demonstrate a close agreement between simulations and theory 
(Section [3.2||3,3[ > over orders of magnitude in probability density, 
and over a wide range of halo masses from dwarf galaxies to clus- 
ters (Section [3~4} . We compared to the Lynden-Bell ( 1967 ) distribu- 
tion which is obtained when energy can be arbitrarily redistributed 
between particles, finding that our new canonical ensemble offers 
a vastly improved fit (see dotted lines in Figure |3J. This strongly 
suggests that (a) maximum entropy with suitable dynamical con- 
straints (representing incomplete violent relaxation) is a plausible 
route to understanding the 6D phase space of dark matter halos; (b) 
the newly constrained quantities need not be conserved in general, 
but must be conserved whenever the system is close to equilibrium, 
so that their value becomes fixed as the dynamics settle down; and 
(c) we have identified a physical argument leading to an important 
example of these constraints. 

However we found an overabundance of low angular momen- 
tum orbits in the simulations relative to the analytic predictions 
(Section [33}. This implies that there is at least one more impor- 
tant constraint that we have not fully reflected in our analysis. Con- 
structing the radial density profile (Section \3.5\ confirms that, al- 
though a small fraction (< 4%) of particles are causing the discrep- 
ancy, their existence is essential to understanding the origin of the 
central density cusps seen in numerical simulations. 

The correspondingly large density of particles as j —> has 
been found by previous work, notably Bull ock et al.| ( |2001a[ >, who 



offered a fitting formula which implies continually increasing par- 
ticle numbers towards j = 0. With our higher resolution simula- 
tions, we can see that Pj(j) does eventually decrease at sufficiently 
low /, but slower than expected given the shrinking available phase 
space (Figure |4j. Furthermore the large number of particles at low 
angular momentum can be linked directly to various discrepancies 
between ACDM theory and observed galaxies ( [van den Bosch et aT] 
|2001| [Dutton & van den Bosch 2009). In particular there must be 
mechanism to remove the low angular momentum bar yons (|Dekel 



& Silk|1986||Binney et al.|2001||Governato et al.|20T0"l|Brook et~ 



201 1 ). Understanding what causes the accumulation of low angu- 
lar momentum material in the first place is now added to the list of 
puzzles in this area. 

Our maximum entropy picture gives an interesting framework 
in which to interpret the situation. In Section |33] we gave an exten- 
sive analysis of equation \19\ which suggests two routes to adding 
material at low j. The first option is to appeal to anisotropy (first 
term on the right hand side); the second is to use a population of 
particles at low energy (high Pe in the second term on the right 
hand side). We currently prefer the second explanation for the fol- 
lowing reason. Particles near the centre have very short orbital peri- 
ods, which make them decouple from fluctuations on the dynamical 
timescale of the remainder of the halo. In numerical simulations, 
the cusps are indeed the first part of the halo to form, and they do 
not change much at late times (Moore et al.|1998[ |Syer & White| 
[1998) [Wang et al.|2011) . |Lu et al.| ( |2006^ construct an explicit 2- 
phase model of the formation of halos reflecting this differentia- 
tion, emphasizing the lack of equilibriation between the inner and 
outer parts of the halo (see also |Lapi & Cav aliere 2011 1. Accord- 
ingly a timescale constraint could be incorporated from the outset 
of the maximum entropy argument; we expect this would give sim- 
ilar results to our current approach of fitting a second population. 
This will be tackled explicitly in future work. 

The alternative view is that the behaviour at low j may be 
sensitive to effects of asphericity. This could modify the effective 
degeneracy. But as we commented in Section |3~3] the only obvious 
method available is to pack orbits tightly into a plane, so making the 
phase space available uniform with j, rather than linearly increas- 
ing. Numerical results do show halos become more anisotropic to- 
wards their centre JJing & Suto 2002). On the other hand, when 
given a second population to fit, our code does not select this as a 
viable explanation for the existence of the cusp (Section [3.3} . 

If a full description of the physics generating low angular mo- 
mentum orbits can be reached, the work in this paper lays the foun- 
dation for a complete description of the collisionless equilibria of 
dark matter halos. Further questions of interest will include: 

• whether and how the constraint vector /3 can be derived from 
initial conditions (which will likely lean heavily on an understand- 
ing of the radial orbit insta bility, e.g. | Saha|1991[ MacMilla n et al.| 
[2006[[Beiiovary et al.|2008[|Barnes et al.|2009| >; 

• whether and how maximum entropy arguments can ex- 
plain the po wer-law behaviour of the pseudo-phase-space density 



p{r)/a (r) i Taylor & Navarro 



2001 



• whether and how the phase mixing is maintained to sufficient 
accuracy to make the (j) conservation effective over periods of ma- 
jor disturbance (perhaps through chaotic mixing e.g. |Merritt & Val-] 
|luri|1996|[Henriksen & Widrow|1997> ; 

• how various moments of the distribution function evolve at 
higher order in perturbation theory (which is closely related to the 
previous question); 
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• how the arguments are changed by adopting an explicitly as- 
pherical background Hamiltonian (assuming this is not already an- 
swered when attacking the low- j question); and 

• how maximum entropy arguments can be adjoined to micro- 
physical descriptions of cusp destruction I Navarro et al.|1996a| El-| 



Zant et al.|200T] [Wein berg & K atz|2002| |Read & Gilmore|2005| 
Mashchenko et al. 2006, Pontzen & Governato 2012) to shed fur- 
ther light on this essential area of galaxy formation. 

Our substantial step forward should give confidence that a full 
statistical account of the distribution of particles in simulated dark 
matter halos is achievable without any ad hoc assumptions or mod- 
ifications to the well-established principle of maximum entropy. 
Such an account would be extremely powerful for practical and 
pedagogical aspects of understanding the behaviour of dark matter 
in the Universe. 
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APPENDIX A: WHY MAXIMIZE ENTROPY? 

In this Appendix we return to the question of why we have de- 
rived particle distribution functions by maximizing the entropy {T}. 
We will give an outline of Jaynes' reasoning (e.g. Jaynes & Bret- 
|thorst|200 3 ): that maximizing entropy subject to given constraints 
is equivalent to testing whether those constraints encapsulate the 
physics of the situation. 

We start by outlining two schools of thought explaining why 
maximizing entropy is meaningful. The first relies on the "H- 
theorem" which states that the entropy increases with time, and 
hence systems evolve towards a state which maximizes their en- 
tropy. However the entropy S[f] as definecj^] by equation {T} is 
actually exactly constant in time for a collisionless system. A re- 
lated quantity which does increase with time is the entropy of the 
'coarse-grained' distribution function F. Here, F is discretized and 
equal to / averaged over a local volume in phase space. But there 
are an infinity of functionals of F which increase with time, and 



no obvious reason to favour S[F] over these alternatives l Tremaine 



|etal.|1986| >. 

Similar difficulties extend to collisional systems. In these 
cases it is only the Gibbs entropy that is perfectly conserved, and 
Boltzmann's entropy typically increases with time | jayne sp965| >. 
However there are experimentally accessible cases where the Boltz- 
mann entropy systematically decreases with time (Jaynes| |"l971| >. 
Consequently the justification for expecting systems to adopt max- 
imum entropy states is not clear from the H-theorem even in this 
case. 

The second school of thought states that entropy represents 
human uncertainty about the state a particle will be found in (Jaynes 
[1957] >. In this case, the entropy functional {T} is derived from Shan- 
non's axiom^]- these are reasonable requirements for what 'hu- 
man uncertainty' can mean (Jaynes & Bretthorst 2003 1. The last of 
Shannon's four axioms (which requires additivity of entropy of in- 
dependent systems) has been questioned (e.g. Tsallis 1988 , P lastino| 
|& Plastino| 1 9991. However, no significant improvement in match- 
ing the phase space of simulations has resulted from these devel- 
opments ( |Feron & Hjorth 2008). Moreover, the axiom in question 
can be viewed as requiring "no unwarranted correlations" (see the 
section on kangaroos in |Jaynes|T9 86 1, in the sense that specifying 
constraints on expectations of any variables (a) and (b) will by de- 
fault choose (ab) = unless any other information specifies to the 
contrary. This seems a strongly desirable property. The remainder 
of the discussion therefore focuses on the known properties of the 
Boltzmann entropy. 

There is one outstanding question: why should we maximize 
our uncertainty? It turns out that if we have to choose a state 
based on the given constraints, the vast majority of all possibilities 



4 |Lynd en-Bell| {1967} discusses an exclusion principle which arises from 
Liouville's theorem and can modify the classical expression for entropy; the 
deviations will be significant if the initial phase space density is comparable 
to the density in any regions of a final 'coarse-grained' view of phase space. 
This does not seem likely to apply in cosmological settings jShu|1978) , 
although see Barnes & Williams (2012) for a different view. 
5 For a unique answer we also need to demand that entropy be invariant un- 
der coordinate transformations of the phase space, which reflects our intial 
ignorance of the distribution of particles before the dynamics are specified. 
For an alternative view, see Williams et al. 1 20101 who recently suggested 
that the measure should be uniform in energy space. This is equivalent to 
imposing an a priori preference for some regions of phase space over oth- 
ers, one that should ultimately be derived from the equations of motion (and 
could then be re-expressed as a constraint). 
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Figure Bl. As a test of our assumptions about phase mixing, we can gen- 
erate a density profile of the simulated halo MW (solid line) where phase 
information is thrown away and the mass from each particle is consistently 
'smeared' along its orbit. The result is plotted as a solid curve, and can be 
compared to the direct 'binned' density estimate of the same simulation; the 
agreement is excellent where the profile is reliable (exterior to 4e ~ 700 pc). 



(putting a uniform prior probability on /((£>) at each point in phase 
space) are arbitrarily close to the maximum entropy result ( |Jaynes| 
1 1979a) >. This might be reflected in an 'ergodic' hypothesis that the 
system actually explores all of these states, but it is not necessary 
that this be the case. The key insight of Jaynes is that if the system 
is consistently found in a different state from that predicted, this 
is evidence for a systematic effect. Once a physical model of that 
effect has been built, it can be incorporated as a further constraint 
and the maximum entropy formalism still stands (Jaynes 1979b) . 

Hence if we ask "does maximum entropy subject to these con- 
straints reproduce the numerical distribution function" we are really 
asking "do these constraints encapsulate the important physics of 
dark matter halo collapse?". That is the aim of this work. 



APPENDIX B: DYNAMICAL DENSITY ESTIMATES 



In Section |33] we discussed the real space radial density profiles 
resulting from our ensemble. We now explain how these are cal- 
culated. Starting from expression (12}, the mass enclosed inside a 
radius r is given by 



M(< r)=M jj 6J r djp{J r ,j)P(< r;J r ,j), (Bl) 

where p(J r ,j) is the distribution function marginalized over j z , 

p(J r ,j)= j dj z f(j)=smhp z jexp(-p r J r -Pjj-p E E(J r J)), 

(B2) 

and P(< r;J r , j) gives the fraction of time that a particle with orbital 
parameters (J r ,j) spends interior to radius r: 

P(<r;JJ) o,R e j^&rU(J r ,j)- ' • ( B 3) 



Taking the real part circumvents the need to find apocentre or peri- 
centre explicitly; however the actual numerical evaluation of this 
integral presents some difficulties discussed in Appendix [C] 

The final solution M(< r) depends on <J> [explicitly through 
equation ( |B 3 [ > and implicitly through E(J r ,j) in equation {12}]. A 
full solution thus demands an iterative approach. However, we have 
found that such iteration presents difficult numerical convergence 
problems in cases with Pe 0, with solutions often oscillating 
wildly. While we are working towards a solution to this problem, 
for the present investigation it will be enough to use 4>(r) derived 
from the simulation, and ask whether a maximum entropy popula- 
tion would correctly trace the original density profile. If the answer 
is 'yes' to reasonable accuracy, the answer will automatically be 
self-consistent. 

This raises an interesting test case: one can reinsert the actual 
simulated p{J r ,j) distribution into the procedure and check that the 
results agree with the original density profile. This does not rely on 
any of the maximum entropy arguments, but rather tests numerical 
algorithms and the assumption that the distribution can be approx- 
imated as spherical and in equilibrium (i.e. phased mixed). Failure 
in any aspect would produce density profiles disagreeing with those 
obtained from naive binning in real space. 

To test this we take the calculated (J r ,j) values for all particles 
in a simulated halo and use these as tracers of the p(J r ,j) distribu- 
tion. This throws away all the phase information from the original 
simulation. We then construct the dynamical mass distribution \V>\\ 
using the same method as for the maximum entropy p(J r ,j). 

In Figure [Bl] we show the results of this test applied to the 
MW halo. The recovered profile (solid curve) is in excellent agree- 
ment with that derived from the raw simulation data (shown by 
points) outside the convergence radius 4e ~ 700 pc. This suggests 
that our analytic assumptions are valid and the numerical apparatus 
is working correctly. We have also noted that in low resolution sim- 
ulations (not shown) the recovered profile is significantly smoother 
than a binned profile. This is because the new approach averages 
the profile over a dynamical time; each particle is smeared through 
multiple density bins. This could be a useful technique for mitigat- 
ing Poisson noise when working with limited particle numbers. 



APPENDIX C: CORRECTIONS AT APOCENTRE AND 
PERICENTRE 

To produce density profiles in real space, as explained in Ap- 
pendix [Bl requires rapid, accurate numerical evaluation of equa- 
tion ( |B3fr . The integrand of that expression is plotted for a typi- 
cal particle in Figure |C1| It is relatively flat over a large range, 
and a fast trapezoid quadrature algorithm can therefore be applied. 
However a branch point at either end of the interval means that 
this technique cannot be applied in the endmost bins. Instead, we 
use an analytic approximation as described below, keeping only 
the lowest order terms in r — ro where ro is the branch point (cor- 
responding to apocentre or pericentre). Consider a particle of en- 
ergy E a and angular momentum j a , and write the effective potential 
*e f f, fl W=*W+ja/2r 2 .Then 



(£ a -*eff,aW) 



-1/2, 



dg> e ff, t 

dr 



{r- r 0. 



and so 



(£ fl -*eff») 1/2 dr~2 



d* e ff, a /dr| r(j 



-1/2 



1/2 



(CI) 



(C2) 
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extremely large. Integrating our test case without the correction 
leads to ~ 40% errors in the outermost lOOpc density bins (cen- 
tred on 8.15kpc and 1 1.85kpc), so the effort over this correction is 
worthwhile. 



Figure CI. A plot of the integrand in equation {53} for a sample potential, 
energy and angular momentum. The integrand is relatively flat over most of 
r and can be safely integrated with a low-order scheme such as the trape- 
zoid rule. However at apocentre and pericentre (here, ~ 8.14 and 1 1.85 kpc 
respectively) the integrand diverges. The integral must be evaluated through 
more careful means as explained in the text. 



We can remove the need to find ro explicitly by using the relations 

d*eff. fl 



E -<K eff J R) + {R-r ) 



dr 



;0. 



(C3) 



d*eff. fl 



dr 



d*eff, fl 



dr 



(C4) 



to write our final integral approximation as 



Jrn 



1/2 



dr ■ 



2(£ a -3> en> (i?)) 
d«l> e ff, a /dr| J{ 



1/2 



= /(*), 



(C5) 

in which ro does not appear explicitly. This expression is fast to 
evaluate since all quantities are known exactly; the denominator is 
just the force -GM{< R) /R 2 + j/R 3 . 

The integration of equation < |B3| > over the full range is accom- 
plished in bins. For a bin ro — ► r\ , we have 



P(r < r < n ) « Re (Z(n ) - 7(r )) , 



(C6) 



even if ro or r\ lie outside the physical range. We should apply this 
approximation in those bins for which it is more accurate than the 
trapezium rule. By comparing the leading errors from both meth- 
ods, we established the rule that the alternative integration method 
described above is used when 



Ea - <I>eft> < 



1/3 



(C7) 



where Ar is the bin size used for trapezium quadrature. In the exam- 
ple above, Ar = lOpc and criterion \C1\ is satisfied when apocentre 
or pericentre is nearer than ~ 50pc away. Note however that the er- 
rors in the trapezium method, despite being so localized, become 
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